TESLA: A Tool for Annotating Geospatial Language Corpora
Abstract
In this paper, we present The gEoSpatial Language Annotator (TESLA), a tool that supports human annotation of geospatial language corpora. TESLA interfaces with a GIS database for annotating grounded geospatial entities and uses Google Earth to visualize both entity search results and the evolving positions of objects and speakers from GPS tracks. We also discuss a current annotation effort using TESLA to annotate location descriptions in a geospatial language corpus.
Similar resources
BECAM tool - a semi-automatic tool for bootstrapping emotion corpus annotation and management
Corpus annotation is an important aspect in speech applications where stochastic models need to be trained and evaluated. Multimodal corpora are also annotated. Moreover, corpus annotation is an essential phase in the construction of emotion recognizer engines. Large corpora, as they are essential to construct representative knowledge bases, have been a problem for corpus annotators. Time consu...
eBonsai: An Integrated Environment for Annotating Treebanks
Syntactically annotated corpora (treebanks) play an important role in recent statistical natural language processing. However, building a large treebank is labor intensive and time consuming work. To remedy this problem, there have been many attempts to develop software tools for annotating treebanks. This paper presents an integrated environment for annotating a treebank, called eBonsai. eBons...
Building and Using Corpora of Non-Native Czech
Investigating language acquisition by non-native learners helps to understand important linguistic issues and develop teaching methods, better suited both to the specific target language and to the learner. These tasks can now be based on empirical evidence from learner corpora. A learner corpus consists of language produced by language learners, typically learners of a second or foreign langua...
Preliminary Experience with Amazon’s Mechanical Turk for Annotating Medical Named Entities
Amazon’s Mechanical Turk (MTurk) service is becoming increasingly popular in Natural Language Processing (NLP) research. In this paper, we report our findings in using MTurk to annotate medical text extracted from clinical trial descriptions with three entity types: medical condition, medication, and laboratory test. We compared MTurk annotations with a gold standard manually created by a domai...
A Web Tool for Building Parallel Corpora of Spoken and Sign Languages
In this paper we describe our work in building an online tool for manually annotating texts in any spoken language with SignWriting in any sign language. The existence of such tool will allow the creation of parallel corpora between spoken and sign languages that can be used to bootstrap the creation of efficient tools for the Deaf community. As an example, a parallel corpus between English and...